Automated Estimation of Vigor in Vineyards


Summary

Estimating the vigor of grapevines (harvested grape weight / pruning weight) gives growers a
useful parameter for preparing for the harvest and for establishing a precision agriculture
plan, with better site-specific planning of operations such as pruning or fertilization.
Traditionally, growers obtain this parameter by first manually weighing the pruned canes
during the vineyard's dormant season (no leaves); second, during harvest, collecting the
weight of the fruit from the vines evaluated in the first step; and then correlating the two
measurements. Because this is a highly manual and time-consuming task, growers usually
obtain this number by taking only a couple of samples and extrapolating the value to the
entire vineyard, losing all the variability present in their fields; this loss of information
can lead to lower grape quality and quantity. In this paper we develop a computer vision
algorithm that is robust to differences in trellis system, variety and ambient light
conditions, to automatically estimate the pruning weight and consequently the variability
of vigor within the block. The results will be used to improve the way growers plan the
annual winter pruning, advancing the transformation toward precision agriculture. Our
solution proposes to create prescription maps (detailed instructions for pruning, harvest
and other location-specific management decisions) automatically, based on the vigor
obtained from processing photographs of the vine. Our solution uses Deep Learning (DL)
techniques to obtain the segmentation of the vines directly from the image captured in
the field during the dormant season. The results show that we can obtain essentially
equivalent interpolation maps between our method and the validation set obtained by
manually weighing the prunings.


Keywords: Artificial Intelligence, Image Segmentation, Agricultural Automation


Abstract 


Estimating the balance or vigor in vines, as the yield to pruning weight relation, is a useful
parameter that growers use to better prepare for the harvest season and to establish
precision agriculture management of the vineyard, achieving site-specific planning [...]


INTRODUCTION


Recent advances in agricultural management have dramatically improved agriculture
around the world through the incorporation of automated processing of field data. These
advances are partially due to the ability to adapt to local factors that influence crop
yield such as climate, growing region and soil type. As a result, a wide range of plant
densities and training/trellis systems are used by growers to optimize harvest practices.
In viticulture a primary consideration when selecting the proper trellis system is the
vine vigor. Highly vigorous vines require larger trellising systems, more space or a
devigorating rootstock compared to low-vigor vines. Traditionally [...] as the vine balance
value, is estimated using the Ravaz [...] and researchers by manually weighing the prunings
during the dormant stage [...] February, growers can only get a first estimate of these
values after [...]. Another problem with this method is that it requires expensive manual
labor, since to be effective many samples need to be taken in different areas of the
vineyard to capture all the variations naturally occurring in every vineyard and therefore
allow the management techniques to adapt to these variations.
Fig. 1. On-Site Vigor Estimation and Vigor Maps. Steps shown: (1) take an image of a vine
before pruning; (2) our computer vision algorithm segments the vine and provides the user
with an estimate of the pruning weight (in kg) at that location; (3) gather the yield data and
evaluate the vigor; (4) create the precision vigor and prescription maps.


In this paper, rather than using manual methods to estimate vigor that require intensive
labor, we use algorithms based on computer vision. In our solution we take images
with a regular smart phone camera and then perform image segmentation to
evaluate in situ the weight of the canes from the segmentation. Our method has several
advantages over the methods currently in use. First, it does not destroy or affect the plant
in any way, and second, results are available immediately after taking the picture.
Since almost everybody has a smart phone with a camera, growers will be able to take
a picture of their vine and get immediate feedback with the expected vigor for the
plant without the need for expensive equipment. This will allow us to create specific
vigor maps like the ones shown in Figure 1, maps that growers can use to adapt the
management of their vineyards to local conditions and improve not only production
but also the quality of the harvest.


Previous Work 


One of the main challenges being studied by the scientific community in viticulture
is early yield prediction. To obtain this value directly from images we need
an accurate segmentation of the vine. Tree segmentation is particularly difficult since
trees usually contain a lot of texture, the pixel colors of the background are similar to
those of the foreground, and trees usually grow in close groups, so it is difficult to tell
where one tree ends and the next begins. Several papers study tree segmentation. [18] presents
a solution for tree segmentation from a complex scene. The proposed algorithm is
mainly composed of a preliminary image segmentation, a trunk structure extraction,
and a leaf region identification process. The extraction of the trunk structure is modeled as
an optimization problem, where an energy function is formulated according to the
color, position, and orientation of the segmented regions. In [12], a trunk and branch
segmentation method was developed using a Kinect V2 sensor and deep learning-based
semantic segmentation. The Kinect was used to acquire point cloud data of the tree
canopies in a commercial apple orchard. Depth and RGB information extracted from the
point cloud data were used to remove the background trees from the RGB image. Then the
trunk and branches of the tree, which share a common appearance and features, were
segmented out using a convolutional neural network (SegNet) with an accuracy of 0.92.
In [8], a binocular stereo vision localization method is proposed to obtain
three-dimensional information about apple branch obstacles.
[4] studies the ratio between crown and trunk diameters in tree images and presents
a new crown-trunk segmentation method to automatically extract this ratio, which is the
most significant feature. In [9] a method for
pruning mass estimation using computer vision is proposed; the images are taken with
a specific stereo vision camera to create the depth map, and the algorithm lacks robustness
to different light conditions.


There are also several research areas in agriculture that use computer vision
techniques in general to boost the productivity of different crops [1]-[15]. Specific to
wine grapes, in [1] an application is developed to evaluate canopy gaps in vines
using computer vision feature extraction algorithms. Canopy porosity is an important
viticulture factor because canopy gaps favor fruit exposure and air circulation, both
of which benefit fruit quality and health. The algorithm used in that work is based on
feature extraction, which is prone to mistakes when the conditions for light, variety
and other factors [...]


There are several papers discussing Deep Learning (DL) [...] where every pixel in the image [...]


In our proposed research we don't use expensive equipment [...] farmers, especially the
smaller ones, is that they don't [...] software. We talked to several local farms to validate
our research in real commercial locations and they are very enthusiastic about [...] their
phones to create prescription maps.


Fig. 2. Original Image taken on the Vineyard


Paper Structure 


The remainder of this paper is organized as follows. Section 2 gives an overview of the
methodology we propose for cane segmentation. Section 3 provides a summary
of the results obtained, and Section 4 presents the conclusions and future work.


METHODOLOGY 


In this section we explain how to obtain a robust segmentation of the vine. As can be
seen in Figure 2, when we capture the images with the phone the background
contains pixels very similar in color to the vine we are trying to segment, which makes
background subtraction difficult. Initially, to make sure we had reliable results and to
be able to validate them, we took two pictures of each vine in the vineyards: one
picture with a white background (Figure 3) and a second picture without the background.
We also obtained the pruning weight of each photographed vine manually.


Fig. 3. Vine Segmentation with artificial white background a) No Background b) With background


These pictures with an artificial background are clearly easier to segment since there is
a strong contrast between the vine and the artificial background. Nevertheless, since the
pictures are taken outside under variable light conditions, the solution was not as simple
as a color segmentation: we first needed to remove some of the shadows,
poles and other artifacts that produce inaccurate results. After a histogram
color correction we applied the watershed [10] segmentation algorithm. Results can
be seen in Figure 4, and in more detail in Section 3.


The accuracy of this segmentation compared with the manual weight is high, but there
is an obvious problem with this approach: it requires an artificial background, which is
at the very least inconvenient since it takes two people to get the pictures. The results are
good, but we also want to make the creation of the maps as easy as possible for the
grower, so in the next section we explore how to do the same without the artificial
background.


Segmentation without artificial background 


Deep Learning solutions for image segmentation have been extensively studied in
recent years, for applications that range from skin cancer detection to autonomous cars.
There are several well-known segmentation models that are open source and free to
use. We tested Mask R-CNN [6] using the implementation in [5] with some modifications
(since we know the vine should be in the center of the image and only one vine should be
segmented per image), but the segmentation mask produced overestimated the cane
weights significantly and was too rough to be of any use.


Fig. 4. Vine Segmentation with artificial white background a) Original b) Segmentation


Fig. 5. a) Original b) Depth Map Calculated with Disparity or Depth Sensor c) Trimap


The problem is that tree branches are difficult to segment since background pixel colors
are very similar to those of the foreground. We need a more accurate segmentation, similar to
the ones used for image matting and in particular the solution proposed in [19]. In [19]
the deep model has two parts. The first part is a deep convolutional encoder-decoder
network that takes an image and the corresponding trimap (an image with just three
colors: background, foreground and border) as inputs and predicts the alpha matte of
the image, Figure 5. The second part is a small convolutional network that refines the
alpha matte predictions of the first network to obtain more accurate alpha values and
sharper edges.
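
For reference, image matting methods such as [19] rest on the standard compositing model, in which each observed pixel is a blend of a foreground color and a background color weighted by its alpha value. The symbols below are introduced here only for illustration: I_i is the observed color of pixel i, F_i and B_i are its foreground and background colors, and alpha_i is its alpha matte value.

```latex
I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \qquad \alpha_i \in [0, 1]
```

In trimap terms, known foreground pixels correspond to alpha_i = 1 and known background pixels to alpha_i = 0, so the difficult part of the estimate is the unknown border region, which for a vine is where thin canes meet a similarly colored background.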


To be able to use this solution for segmentation we need to obtain the trimaps of the
vines. Since we are limited to using smart phones to obtain the images,
to create these trimaps we will use the depth map sensors that come with most smart
phone cameras, as described next.


Depth Maps. Most smart phones have at least two cameras on the back. These cameras
allow software to create depth maps (images that contain
information about the distance of the surfaces of objects from a given viewpoint)
and with them intelligently blur the background and create professional portrait effects.
In this paper we use these depth maps from smart phones to create trimap images. A
trimap image contains three regions: known background, known foreground, and an
unknown region. Once we have the trimaps of the vines (Section 2.2) we train a model that
takes the original image and the trimap to accurately segment the vines.


There are two main ways phones can get depth information.
1. Disparity. Perception of depth arises from the “disparity” of a given 3D point between
   your left and right retinas. Disparity is the difference in image location of the same 3D
   point when projected under perspective to two different cameras, Figure 6.
2. Depth Sensor. Most phones also provide various IR sensors to calculate depth;
   their depth data is much more detailed, particularly at close range. The depth
   estimation works by having an IR emitter send out 30,000 dots arranged in a
   regular pattern. They're invisible to people, but not to the IR camera that reads the
   deformed pattern as it shows up reflected off surfaces at various depths. This is the
   same type of system used by the original version of Microsoft's Kinect, which was
   widely praised for its accuracy at the time. The accuracy of the depth map from the
   depth sensor is better than that of the disparity maps, but most of the time it forces the
   user to use the front-facing camera and/or to be close (within half a meter of
   the vine), which means we cannot have the entire plant in one picture. Although
   some of the latest smart phone models are improving these sensors (for example the
   Samsung 20+) and getting more and more accurate results at longer distances,
   unfortunately right now with the phones on the market we cannot use these types
   of sensors for this project. We do expect that in the future these sensors will be
   standard on smart phones, which will make our proposed segmentation even
   more accurate and simple to obtain.
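
The disparity relation illustrated in Figure 6 can be sketched with the standard rectified pinhole stereo model, in which depth is Z = f * B / d for a focal length f (in pixels), a baseline B and a disparity d (in pixels). This is a minimal illustration and not the depth pipeline of any particular phone; the function name and the sample focal length and baseline are assumptions chosen only to produce a round number.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return the depth Z (metres) of a point from its stereo disparity.

    Standard rectified pinhole stereo model: Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means a point at infinity)")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values for illustration: f = 1500 px, B = 0.012 m between the two lenses.
z = depth_from_disparity(disparity_px=12.0, focal_length_px=1500.0, baseline_m=0.012)
print(round(z, 2))  # depth in metres
```

Because depth is inversely proportional to disparity, a small baseline such as the one between two phone lenses yields small disparities at vine distance, which is one reason disparity-based depth maps are coarser than those of a dedicated depth sensor.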


Proposed Algorithm for Segmentation without Artificial Background. If the smart phone has
a dual camera or depth sensors that allow the creation of a depth map:

1. Use the distance measuring toolbox to make sure the phone camera is 1.5 m from
   the vine.
2. Capture the image in portrait mode (this way the depth information is saved
   together with the RGB image).
3. Separate the depth information from the previous image.
4. Create a trimap with a simple color segmentation based on the depth image.
   Threshold the image: anything at a distance larger than 1.5 m is
   considered background. Then, applying a simple Canny edge detector, we classify pixels
   on the edge as the intermediate region, and the rest are part of the vine.
5. Use alpha matting to refine the segmentation of the cane.
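
The thresholding and edge-band steps of the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used for our results: the depth map is assumed to be already expressed in metres, the label values 0/128/255 and the function name are hypothetical, and a plain neighbourhood test stands in for the Canny edge detector so that the example needs no image library.

```python
BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255  # assumed trimap label values


def trimap_from_depth(depth_m, max_distance_m=1.5, band_px=1):
    """Build a trimap (list of rows) from a depth map given in metres.

    Pixels farther than max_distance_m are background, the rest foreground.
    Any pixel within band_px of a foreground/background boundary is then
    relabelled UNKNOWN; this neighbourhood test is a stand-in for the
    Canny edge detector mentioned in the text.
    """
    rows, cols = len(depth_m), len(depth_m[0])
    is_fg = [[d <= max_distance_m for d in row] for row in depth_m]
    trimap = [[FOREGROUND if fg else BACKGROUND for fg in row] for row in is_fg]
    for r in range(rows):
        for c in range(cols):
            for rr in range(max(0, r - band_px), min(rows, r + band_px + 1)):
                for cc in range(max(0, c - band_px), min(cols, c + band_px + 1)):
                    if is_fg[rr][cc] != is_fg[r][c]:
                        trimap[r][c] = UNKNOWN
    return trimap


# Toy 1 x 6 depth row (metres): background at 4 m on the left, vine at 1.4 m on the right.
toy_depth = [[4.0, 4.0, 4.0, 1.4, 1.4, 1.4]]
print(trimap_from_depth(toy_depth))
```

A wider band_px leaves more pixels for the alpha matting step to resolve, which may help with thin canes at the cost of more work for the matting network. Invalid depth readings (for example zeros from the sensor) are not handled in this sketch.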


If the smart phone doesn't have dual cameras that can be used to generate depth maps, the
algorithm is as follows:

1. Ask the user to stand 1 m from the vine (if necessary, provide a ruler).
2. Capture the image.
3. Get the user to select points in the image using the method based on [13].
4. Create a trimap based on the above rough segmentation.
5. Use alpha matting to refine the segmentation of the cane.


Fig. 6. Depth Calculation (Z), by using the baseline (distance between two cameras)


RESULTS 


To validate our proposed computer vision implementation, for every image of a vine
taken in the field with the smart phone we also collected the GPS location, the altitude, and
the pruning weight obtained manually. To obtain this data we followed the pruning crew
in the vineyard, taking images before they pruned each vine and collecting and weighing the
canes of each vine after pruning. A sample subset of the data collected is shown in Table 1.


Table 1. Data Collected 


No.  Easting  Northing  Alt  Pruning Weight
1    701294   3940751   387  0.12
2    701287   3940743   387  0.15
3    701265   3940741   385  0.12
4    701244   3940745   382  0.21
5    701215   3940745   380  0.43
6    701196   3940742   377  0.76
7    701210   3940737   377  0.95
8    701232   3940738   379  0.58
9    701253   3940738   383  1.21
10   701272   3940736   382  0.4
11   701292   3940737   385  0.05
12   701285   3940732   387  0.22
13   701263   3940729   381  0.45
14   701249   3940729   381  1.32
15   701235   3940722   378  0.85
16   701249   3940728   382  0.48
17   701268   3940725   383  0.38
18   701281   3940728   384  0.19


We used QGIS [16] software to create the pruning weight interpolation maps for
the entire vineyard. These maps are key in precision agriculture since they present a
visualization of the data collected, simplifying the identification of different yield areas
(in our case pruning weight). These maps can later be used by the production manager
or viticulturist to apply different management instructions for the distinct areas and to
easily compare the effects of different years' production methods on the vineyard.
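
As a minimal sketch of how such an interpolation map can be computed outside QGIS, the following uses inverse distance weighting (IDW) on the first six vines of Table 1. IDW, the power parameter and the function name are assumptions made for illustration; they are not necessarily the interpolator or settings configured in QGIS for the maps in this paper.

```python
import math

# (easting, northing, pruning weight) for the first six vines of Table 1.
SAMPLES = [
    (701294, 3940751, 0.12),
    (701287, 3940743, 0.15),
    (701265, 3940741, 0.12),
    (701244, 3940745, 0.21),
    (701215, 3940745, 0.43),
    (701196, 3940742, 0.76),
]


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate of pruning weight at (x, y)."""
    num = den = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # the query point coincides with a sampled vine
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


print(idw(701294, 3940751, SAMPLES))  # at vine 1 the estimate is its own weight
print(round(idw(701250, 3940744, SAMPLES), 2))  # estimate between sampled vines
```

Evaluating idw on a regular grid of easting/northing values and coloring the cells by class would give a raster comparable in spirit to the interpolation maps shown in this section.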


In this project we first create the interpolation map using the manual pruning weights,
Figure 7. In this map we can see two very different areas in the vineyard: the top right
one (lower production) and the rest (higher production). This difference may be due to
the inclination of the terrain, which is very clear in the altitude data. We want to make
clear that in this project we are not giving or evaluating the reason for the difference
between areas; we leave this for a future project where we will also take soil samples
and other data. The main goal of the project is to create the map automatically and
faster than the traditional (manual) way. Therefore, validation for our project consists of
providing a map equivalent to the one shown in Figure 7.


Fig. 7. Pruning Weight Interpolation Map using Manual Pruning Weight. Legend classes: Low, Medium Low, Medium High, High


We also visually inspected all the images, assigned a weight from 1 to 10 to each of
the vines and created a second interpolation map, Figure 8. A value of 1 is assigned for
low pruning weight and 10 means high pruning weight. The purpose of this visual
inspection was, first, to become familiar with the input images (in machine learning it is very
important to really know and understand your input data) and, second, to find out whether a
human can provide a basic estimate of the weight based on images alone. If a human can
find the pattern, then a machine learning algorithm should be able to do the same given
the same input, the images.


Fig. 8. Pruning Weight Visual Evaluation. Scale 1 (low pruning weight) to 10 (high pruning weight)


There are small differences between Figure 7, manual pruning weight, and Figure 8,
visual inspection of the images. The reason for the differences is that some of the vines'
canes were left unpruned to be used as guides for cordons next season;
but for the most part the maps are equivalent for determining different yield zones in the
vineyard.


Results Using Images with Artificial Background and Watershed Segmentation


Using the segmentation algorithm described in Section 2.1 we obtain the pruning weight map
in Figure 9. As can be seen from both interpolation maps, Figures 7 and 9, the areas
of different pruning weight are basically equivalent.


There are still some differences, but they are mainly due to the different pruning techniques
applied to some vines where, instead of pruning all the canes, the crew left two big ones as
cordons or guides for next year. This method was not consistently applied to all vines, hence
the variation. In the future we will add the individual pruning method for each vine to the
data collected. All of the vineyards that we collected data from are commercial
facilities and they have some inconsistencies like the one mentioned: sometimes they
leave cordons and sometimes they take entire branches instead of just the new growth
canes (shoots). This doesn't mean the segmentation is wrong; the differences are due to
the automatic segmentation only measuring the weight of new growth, while
the pruning crew will sometimes take more than the new growth of the vine.


Fig. 9. Pruning Weight Interpolation Map using Watershed Segmentation on Images With Artificial Background. Legend classes: Low, Medium Low, Medium High, High


Results Using Images without Artificial Background and Alpha Matte Segmentation


Using the segmentation algorithm described in Section 2.2 we obtain the pruning weight map
in Figure 10.


As can be seen from the images, the interpolation map is basically equivalent to the
one obtained by manually pruning each of the vines, which shows that the automatic
segmentation we are proposing works, with the great advantage of not needing
an artificial background or intensive manual labor to create the maps. The small
inconsistencies in this map are the same as for the images with artificial background and
have the same cause: some of the vines were left with two new cordons, whose
weight is not included in the manual pruning weight but is evaluated
by the automatic image-based algorithms.


Fig. 10. Pruning Weight Interpolation Map using Alpha Matte Segmentation on Images Without Artificial Background. Legend classes: Low, Medium Low, Medium High, High


Results Using Specific Depth Map Hardware Sensors 


We also performed some tests with the Intel RealSense D435 camera [7]. These tests were done
to compare depth maps obtained with a smart phone camera to those from a dedicated depth
sensor such as the ones in RealSense. The RealSense camera requires the installation of specific
software on a desktop computer, and the user needs to take the camera and the
computer to the field to take the pictures of the vines, which can be very inconvenient
and for many farmers too complicated to even try. Nevertheless, we wanted to compare
results. Since RealSense is dedicated depth-mapping hardware, we expected the quality
of the depth maps to be better than the ones we obtained with the smart phone
camera. The preliminary results obtained can be seen in Figure 11.


Fig. 11. Depth Map obtained using the RealSense Camera a) Depth map b) Original image


From Figure 11 we observe that the depth sensor has trouble creating an accurate
depth map of the vine and misses most of the canes (shoots). We think the main
problem is that the RealSense sensor was designed to work in indoor environments with
constant lighting, and it doesn't work well outdoors with variations in lighting
conditions. We need to further test these Intel sensors, since the latest version is more
accurate, uses lidar technology to create the depth map and works at distances of
around 10 meters, although all are designed mainly for indoor use. Right now, with
version D435, the results can't be used for our project, and the system is inconvenient to
take to the field, which supports our original idea that doing this with a regular phone
camera is the best option.


CONCLUSION AND FUTURE WORK 


According to the California Department of Food and Agriculture, the total investment in
AI technology in agriculture was around $5 billion in 2017. This number is expected to
grow even more in the coming years. The NSF has recently issued a call for grant proposals
to fund AI Research Institutes in the USA, specifically mentioning the track in AI-Driven
Innovation in Agriculture and the Food System [14]. Why is it so important to get more
AI in agriculture? Recent advances in agricultural management have dramatically
improved agriculture around the world. These advances are partially due to the ability to
adapt to local factors that influence crop yield such as climate, growing region and soil
type. The main reason why farmers do not already adapt to local conditions is that
they don't have the knowledge or tools to do precision agriculture. AI techniques will
help by automating these techniques and therefore making them cheaper and available
to small farmers.


In our particular viticulture project, a wide range of plant densities and training/trellis
systems are used by growers to optimize harvest practices. A primary consideration
when selecting the proper trellis system is the vine vigor. In this project we successfully
implemented the segmentation of grape vines to estimate this vigor automatically,
providing the necessary information to the grower in a timely manner.


The software used (written in Python) is still being developed and is therefore not
yet available as open source; readers interested in obtaining a copy for
research purposes may contact the author of the paper.